
    Fast k-means based on KNN Graph

    In the era of big data, k-means clustering has been widely adopted as a basic processing tool in various contexts. However, its computational cost can be prohibitively high when the data size and the number of clusters are large. It is well known that the processing bottleneck of k-means lies in the search for the closest centroid in each iteration. In this paper, a novel solution to the scalability issue of k-means is presented. In the proposal, k-means is supported by an approximate k-nearest-neighbor graph. In each k-means iteration, a data sample is compared only to the clusters in which its nearest neighbors reside. Since the number of nearest neighbors considered is much smaller than k, the cost of this step becomes minor and independent of k, and the processing bottleneck is thereby overcome. Most interestingly, the k-nearest-neighbor graph is constructed by iteratively calling the fast k-means itself. Compared with existing fast k-means variants, the proposed algorithm achieves a speed-up of hundreds to thousands of times while maintaining high clustering quality. When tested on 10 million 512-dimensional data points, it takes only 5.2 hours to produce 1 million clusters; to perform clustering at the same scale, traditional k-means would take about 3 years.
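
    The assignment rule described above translates into a short routine. Below is a minimal NumPy sketch, not the authors' implementation: the precomputed neighbor lists `knn`, the random initialization, and the fixed iteration count are all illustrative assumptions (in the paper, the graph itself is built by recursively calling the fast k-means).

```python
import numpy as np

def knn_graph_kmeans(X, k, knn, iters=20, seed=0):
    """k-means where each point is compared only to the centroids of
    the clusters its approximate nearest neighbors belong to.

    X   : (n, d) data matrix
    k   : number of clusters
    knn : (n, m) integer array of approximate nearest neighbors, m << k
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    assign = rng.integers(0, k, size=n)  # assumed random initialization
    for _ in range(iters):
        # centroid update: scatter-add points, then divide by cluster sizes
        centroids = np.zeros((k, d))
        np.add.at(centroids, assign, X)
        counts = np.bincount(assign, minlength=k)
        nz = counts > 0
        centroids[nz] /= counts[nz][:, None]
        # assignment: candidates are only the clusters in which the
        # point's neighbors (and the point itself) currently reside
        for i in range(n):
            cand = np.unique(np.append(assign[knn[i]], assign[i]))
            dist = np.linalg.norm(centroids[cand] - X[i], axis=1)
            assign[i] = cand[np.argmin(dist)]
    return assign, centroids
```

    Since each candidate list holds at most m + 1 distinct clusters, the per-point cost of the assignment step no longer grows with k.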

    Language Models for Image Captioning: The Quirks and What Works

    Two recent approaches have achieved state-of-the-art results in image captioning. The first uses a pipelined process in which a set of candidate words is generated by a convolutional neural network (CNN) trained on images, and a maximum entropy (ME) language model then arranges these words into a coherent sentence. The second uses the penultimate activation layer of the CNN as input to a recurrent neural network (RNN) that then generates the caption sequence. In this paper, we compare the merits of these different language modeling approaches for the first time by using the same state-of-the-art CNN as input. We examine issues in the different approaches, including linguistic irregularities, caption repetition, and dataset overlap. By combining key aspects of the ME and RNN methods, we achieve a new record performance over previously published results on the benchmark COCO dataset. However, the gains we see in BLEU do not translate to human judgments.

    Comment: See http://research.microsoft.com/en-us/projects/image_captioning for project information.
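
    As a concrete illustration of the second pipeline, here is a minimal PyTorch sketch; the layer sizes, the single-layer LSTM, and feeding the projected image feature as the first time step are illustrative assumptions, not the exact architecture evaluated in the paper.

```python
import torch
import torch.nn as nn

class CNNFeatureCaptioner(nn.Module):
    """CNN penultimate activations condition an RNN caption decoder."""
    def __init__(self, feat_dim=4096, embed_dim=512, vocab_size=10000):
        super().__init__()
        self.img_proj = nn.Linear(feat_dim, embed_dim)  # image -> RNN space
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.rnn = nn.LSTM(embed_dim, embed_dim, batch_first=True)
        self.out = nn.Linear(embed_dim, vocab_size)

    def forward(self, cnn_feats, tokens):
        # prepend the projected image feature as a pseudo first token
        img = self.img_proj(cnn_feats).unsqueeze(1)        # (B, 1, E)
        seq = torch.cat([img, self.embed(tokens)], dim=1)  # (B, T+1, E)
        hidden, _ = self.rnn(seq)
        return self.out(hidden)  # next-token logits at every step
```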

    Systematic study of proton radioactivity of spherical proton emitters within various versions of proximity potential formalisms

    In this work we present a systematic study of the proton radioactivity half-lives of spherical proton emitters within the Coulomb and proximity potential model. We investigate 28 different versions of the proximity potential formalisms developed for the description of proton radioactivity, $\alpha$ decay, and heavy-particle radioactivity. It is found that 21 of them are not suitable for describing proton radioactivity, because the classical turning point $r_{\text{in}}$ cannot be obtained: the depth of the total interaction potential between the emitted proton and the daughter nucleus lies above the proton radioactivity energy. Among the remaining 7 versions of the proximity potential formalisms, Guo2013 gives the lowest root-mean-square (rms) deviation in the description of the experimental half-lives of the known spherical proton emitters. We use this proximity potential formalism to predict, within a factor of 3.71, the proton radioactivity half-lives of 13 spherical proton emitters whose proton radioactivity is energetically allowed or observed but not yet quantified.

    Comment: 10 pages, 5 figures. This paper has been accepted by The European Physical Journal A (in press, 2019).
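
    For context, models of this family usually obtain the half-life from a WKB barrier penetration probability; a generic form (the exact assault-frequency and potential conventions vary across the 28 formalisms, so this is only an assumed sketch) is

```latex
T_{1/2} = \frac{\ln 2}{\nu P}, \qquad
P = \exp\!\left[-\frac{2}{\hbar}\int_{r_{\text{in}}}^{r_{\text{out}}}
    \sqrt{2\mu\left(V(r) - Q_p\right)}\,\mathrm{d}r\right],
```

    where $\nu$ is the assault frequency, $\mu$ the reduced mass, $V(r)$ the total proton-daughter interaction potential, $Q_p$ the proton radioactivity energy, and $r_{\text{in}}$, $r_{\text{out}}$ the classical turning points solving $V(r) = Q_p$. This makes the failure mode above explicit: if the potential pocket never drops below $Q_p$, the inner turning point $r_{\text{in}}$ does not exist and the penetrability is undefined.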

    Progressive Domain-Independent Feature Decomposition Network for Zero-Shot Sketch-Based Image Retrieval

    Zero-shot sketch-based image retrieval (ZS-SBIR) is a cross-modal retrieval task: searching natural images with free-hand sketch queries under the zero-shot scenario. Most existing methods solve this problem by simultaneously projecting visual features and semantic supervision into a low-dimensional common space for efficient retrieval. However, such a low-dimensional projection destroys the completeness of the knowledge in the original semantic space, so useful knowledge cannot be transferred well when learning semantics from different modalities. Moreover, domain information and semantic information are entangled in the visual features, which hinders cross-modal matching because it impedes the reduction of the domain gap between sketches and images. In this paper, we propose a Progressive Domain-independent Feature Decomposition (PDFD) network for ZS-SBIR. Specifically, under the supervision of the original semantic knowledge, PDFD decomposes visual features into domain features and semantic features, and the semantic features are then projected into the common space as retrieval features for ZS-SBIR. The progressive projection strategy maintains strong semantic supervision. Besides, to guarantee that the retrieval features capture clean and complete semantic information, a cross-reconstruction loss is introduced to encourage any combination of retrieval features and domain features to reconstruct the visual features. Extensive experiments demonstrate the superiority of PDFD over state-of-the-art competitors.
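
    The cross-reconstruction constraint can be sketched in a few lines. The following PyTorch fragment is illustrative only: the linear encoders and decoder, the MSE objective, and treating v_a and v_b as visual features of a sketch-image pair with shared semantics are assumptions, not the paper's exact losses.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FeatureDecomposer(nn.Module):
    """Split a visual feature into semantic (retrieval) and domain parts."""
    def __init__(self, vis_dim=512, sem_dim=300, dom_dim=128):
        super().__init__()
        self.sem_enc = nn.Linear(vis_dim, sem_dim)  # semantic/retrieval part
        self.dom_enc = nn.Linear(vis_dim, dom_dim)  # domain part
        self.decoder = nn.Linear(sem_dim + dom_dim, vis_dim)

    def forward(self, v):
        return self.sem_enc(v), self.dom_enc(v)

def cross_reconstruction_loss(model, v_a, v_b):
    """Any combination of semantic and domain features should reconstruct
    the visual feature of the matching domain (assumed pairing)."""
    s_a, d_a = model(v_a)
    s_b, d_b = model(v_b)
    rec = lambda s, d: model.decoder(torch.cat([s, d], dim=-1))
    return (F.mse_loss(rec(s_a, d_a), v_a) + F.mse_loss(rec(s_b, d_b), v_b)
            + F.mse_loss(rec(s_a, d_b), v_b) + F.mse_loss(rec(s_b, d_a), v_a))
```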